static-bytes and endianness

The Haskell package static-bytes was spun out of the pantry package. Ilias Tsitsimpis reported that static-bytes-0.1.0 did not work as intended on big-endian machine architectures, specifically IBM’s s390x. A big-endian architecture stores the most significant byte of a multi-byte word at the lowest memory address. A little-endian architecture does the opposite. x86_64 is little-endian and AArch64 is little-endian on most machines.
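To make the two byte orders concrete, here is a minimal sketch (not code from static-bytes) that lists the bytes of a Word64 in the order each kind of machine would lay them out in memory. The names bytesLE and bytesBE are hypothetical and used only for illustration.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word64, Word8)

-- | Bytes of a 64-bit word in little-endian memory order: the least
-- significant byte comes first (lowest address).
-- Hypothetical helper, for illustration only.
bytesLE :: Word64 -> [Word8]
bytesLE w = [fromIntegral (w `shiftR` (8 * i)) | i <- [0 .. 7]]

-- | The same bytes in big-endian memory order: the most significant byte
-- comes first (lowest address).
bytesBE :: Word64 -> [Word8]
bytesBE = reverse . bytesLE
```

For example, bytesLE 0x0102030405060708 is [8,7,6,5,4,3,2,1], while bytesBE of the same value is [1,2,3,4,5,6,7,8].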

DynamicBytes and StaticBytes

The static-bytes-0.1.0 package makes use of two type classes, DynamicBytes and StaticBytes. The functions promised by the classes are not exported. The types provided by the package (Bytes8 to Bytes128) are instances of StaticBytes. Types that provide sequences of (8-bit) bytes are instances of DynamicBytes (currently Data.ByteString.ByteString; Data.Vector.Primitive.Vector Word8; RIO.Vector.Storable.Vector Word8, a re-export from Data.Vector.Storable; and RIO.Vector.Unboxed.Vector Word8, a re-export from Data.Vector.Unboxed).

The type classes are used in the constraints of conversion functions. The most tolerant one to convert from dynamic to static is toStaticPadTruncate:

fromStatic converts from static to dynamic:

Let’s look at Data.ByteString.ByteString and Bytes16 as an example.
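Before the walk-through, it may help to have a rough mental model of what "pad and truncate" means. The following is only a sketch over plain lists of bytes, assuming that padding appends zero bytes at the end; padTruncate is a hypothetical name and is not part of the package.

```haskell
import Data.Word (Word8)

-- | Hypothetical model of a pad-or-truncate conversion: force a byte
-- sequence to exactly n bytes. Shorter inputs are padded with trailing
-- zero bytes (an assumption); longer inputs lose their tail.
padTruncate :: Int -> [Word8] -> [Word8]
padTruncate n bs = take n (bs ++ repeat 0)
```

Under that assumption, padTruncate 4 [1,2] gives [1,2,0,0] and padTruncate 2 [1,2,3] gives [1,2].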

ByteString and Bytes16

The implementation of toStaticPadTruncate is:

withPeekD is promised by the DynamicBytes class. For the ByteString instance, it is:

toForeignPtr :: ByteString -> (ForeignPtr Word8, Int, Int) deconstructs a ByteString into a ForeignPtr, an offset and a length. The first Int is the offset (which is 0) and the second Int is the length of the string of bytes, in bytes.

withPeekForeign is:

inner, given a function that takes an Int (an offset) and yields an action providing a Word64, yields an action that provides a value of type b (in this instance, sbytes). In this case, the function given to inner is f. f either puts up to 7 bytes into a Word64, on the assumption that the first bytes are the least significant, or puts 8 bytes directly into a Word64. It seems to me that, on a big-endian machine, the latter will treat the first bytes as the most significant.
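The "first bytes are the least significant" convention can be written down arithmetically, which makes it independent of the machine. This is a sketch only, not the package's f; packLE is a hypothetical name.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word64, Word8)

-- | Pack up to eight bytes into a Word64, treating the first byte as the
-- least significant. Because it uses shifts rather than a memory read,
-- the result is the same on little-endian and big-endian machines.
-- Hypothetical helper, for illustration only.
packLE :: [Word8] -> Word64
packLE = foldr step 0 . take 8
  where
    step b acc = (acc `shiftL` 8) .|. fromIntegral b
```

Here packLE [0x01, 0x02] is 0x0201. Reading eight bytes straight from memory as a Word64 gives the same answer as packLE only on a little-endian machine, which is the discrepancy described above.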

In this case, inner is usePeekS 0. usePeekS is promised by the StaticBytes class. For the Bytes16 instance, it is:

The data constructor of Bytes16 is not exposed, but it is Bytes16 !Bytes8 !Bytes8. Bytes8 is simply a newtype for Word64. The first field is filled first and then the second field is filled.

For the Bytes8 instance, usePeekS is:

The implementation of fromStatic is:

The first function to be applied in the definition of fromStatic, toWordsS, is promised by the StaticBytes class. For the Bytes8 instance it is:

and for the Bytes16 instance it is:

The second field is added to the list of Word64 and then the first field is added. So, toWordsS b is adding to the head of a list of Word64 and the least significant Word64 is added last.

The next function to be applied, ($ []) :: ([a] -> b) -> b, starts the list of Word64 with an empty list.
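The pattern described here, where each value yields a function that prepends its words and ($ []) supplies the initial empty list, can be modelled with stand-in types. This is a sketch under the assumption that the two fields are simply composed; B8, B16, wordsB8, wordsB16 and toWords are hypothetical names, not the package's own.

```haskell
import Data.Word (Word64)

-- Stand-ins for Bytes8 and Bytes16 (hypothetical, for illustration only).
newtype B8 = B8 Word64
data B16 = B16 !B8 !B8

-- | Prepend the single word of a B8 to an existing list.
wordsB8 :: B8 -> [Word64] -> [Word64]
wordsB8 (B8 w) = (w :)

-- | Prepend both words of a B16. The second field is consed first and the
-- first field last, so the first field ends up at the head of the list.
wordsB16 :: B16 -> [Word64] -> [Word64]
wordsB16 (B16 a b) = wordsB8 a . wordsB8 b

-- | Run the list builder, starting from the empty list.
toWords :: B16 -> [Word64]
toWords = ($ []) . wordsB16
```

With these definitions, toWords (B16 (B8 1) (B8 2)) is [1,2].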

The final function to be applied in the definition of fromStatic is fromWordsD (lengthS (Nothing :: Maybe sbytes)). fromWordsD is promised by the DynamicBytes class. For the ByteString instance it is:

The implementation of fromWordsForeign is:

I assume that the behaviour of pokeElemOff :: Ptr Word64 -> Int -> Word64 -> IO () depends on the endianness of the machine architecture.
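One way to check that assumption on a given machine is to write a Word64 with pokeElemOff and read the eight bytes back in address order. The sketch below uses only functions from base; storedBytes and expectedBytes are hypothetical names introduced for this check.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word64, Word8)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (peekElemOff, pokeElemOff)
import GHC.ByteOrder (ByteOrder (..), targetByteOrder)

-- | Write a Word64 with pokeElemOff, then read back its eight bytes in
-- increasing address order.
storedBytes :: Word64 -> IO [Word8]
storedBytes w = alloca $ \p -> do
  pokeElemOff p 0 w
  mapM (peekElemOff (castPtr p :: Ptr Word8)) [0 .. 7]

-- | The byte order one would expect storedBytes to report, according to
-- the byte order GHC records for the target machine.
expectedBytes :: Word64 -> [Word8]
expectedBytes w = case targetByteOrder of
  LittleEndian -> le
  BigEndian -> reverse le
  where
    le = [fromIntegral (w `shiftR` (8 * i)) | i <- [0 .. 7]]
```

On a little-endian machine, storedBytes 0x0102030405060708 should yield the bytes from 0x08 up to 0x01; on a big-endian machine such as s390x, the order should be reversed.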

The fromWordsD of the ByteString instance is also used to implement the Bytes8 instance of Show, as follows:

Enforcing endianness

The fix was to make use of helper functions to enforce the endianness of Word64 values (with names inspired by the names of similar functions provided by the cpu package):
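As an illustration of what such helpers can look like, here is a minimal sketch that relies on GHC.ByteOrder from base to detect the target byte order. The names toLE64 and fromLE64 follow the naming style of the cpu package; whether the package's helpers are written this way is an assumption.

```haskell
import Data.Word (Word64, byteSwap64)
import GHC.ByteOrder (ByteOrder (..), targetByteOrder)

-- | Interpret a Word64 that was read from memory holding little-endian
-- data: a no-op on little-endian machines, a byte swap on big-endian ones.
-- Sketch only; not necessarily how static-bytes defines it.
fromLE64 :: Word64 -> Word64
fromLE64 = case targetByteOrder of
  LittleEndian -> id
  BigEndian -> byteSwap64

-- | Prepare a Word64 so that writing it to memory produces little-endian
-- bytes. Swapping is its own inverse, so this is the same operation.
toLE64 :: Word64 -> Word64
toLE64 = fromLE64
```

The idea is that a value read with a peek is passed through fromLE64, and a value about to be written with a poke is passed through toLE64, so that the bytes in memory are always in little-endian order regardless of the machine.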

So, in withPeekForeign we have:

and in fromWordsForeign we have: